perm filename NEWSEQ.DOC[COM,LSP] blob
sn#635646 filedate 1982-01-20 generic text, type C, neo UTF8
COMMENT ā VALID 00003 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002
C00024 00003 Random thought on NEWSEQ:
C00045 ENDMK
Cā;
PROPOSAL FOR KEYWORD-STYLE
SEQUENCE FUNCTIONS FOR COMMON LISP
Scott E. Fahlman
Carnegie-Mellon University
January 17, 1982
A problem in the version of Common Lisp proposed in the Swiss Cheese edition is
that the number of sequence and list grovelling functions is very large. Prior
to the November meeting I proposed that the number of these functions could be
greatly reduced through the judicious use of keywords to select some of the
options that otherwise would tend to multiply the number of built-in functions.
In this paper I will attempt to fill in the details of such a scheme.
First, let us assume that some scheme is adopted whereby keywords (symbols with
a colon in fromt of them) always evaluate to themselves. Thus, one can write
:FOO rather than ':FOO. This is not a necessary part of this scheme, but it
does make functions containing keywords look a lot cleaner.
Second, I have come around to the view that the type-specific forms of these
functions should not be part of the language as it is seen by the user; only
the generic forms should be supplied. If the user wants the greatest possible
execution speed or better error checking, he can declare the types of the
sequence arguments, either by storing these arguments in variables declared to
be of a specific type or by the use of the new THE construct (whatever name we
end up giving it). The compiler is then free make whatever use it can of this
additional information in order to generate better code. Thus, one would
normally write
(length foo)
where FOO can be any type of sequence, but could also use the form
(length (the :string foo))
for faster execution when the code is compiled.
[rpg: But what about interpreted code? Some people need to run interpreted
code to debug, so do you propose that the interpreter understand THE? This
won't buy any speed. I think we want the type specific ones just for this
reason. Remember, *you* don't have the S-1.]
In these generic functions, symbols are not acceptable as sequences and are not
coerced into strings. I see no good way to make this feature compatible with
generics of this type.
Sticking to generics helps a lot in reducing the number of functions, but there
are still too many. Most of the functions that do some sort of search through
the members of a sequence come in groups of five (or sometimes only three)
functions, according to what test or comparison function is to be used. On top
of that, many of the searches can be performed either forward or backward
through the sequence, multiplying the number of functions by two. These are
the options that I propose to handle with keywords rather than with individual
functions.
The following functions would be supplied only in generic form and would have
no need for keywords. They would remain as documented in the Swiss Cheese
edition, except that the type-specific forms would be eliminated: ELT, SETELT,
SUBSEQ, COPYSEQ, LENGTH, FILL, REVERSE, NREVERSE, SOME, EVERY, NOTANY,
NOTEVERY, SORT, STABLE-SORT, and MERGE. TO-LIST, TO-VECTOR, and friends would
also be provided in generic form only and would take no keywords.
REPLACE would probably also remain in its current form, though there are so
many optional args that we might want to turn this into a keyword function
along the lines described below. (Opinions?)
The MAP function would take a function argument, then a type-specifier argument
indicating the type of output sequence that is to be returned. (If this
argument is (), return no sequence, just discard the results.) Next, there
would be any number of sequence arguments, which may be a mixture of sequence
types. By requiring that the output type be explicitly specified, we eliminate
the ugly problem of finding a nice coercion hierarchy. Map would take no
keywords.
The CONCAT function would take any number of sequence arguments of the same
type and would return a new sequence of that type. If mixed sequences are to
be concatenated, they must coerced to the proper type with the TO-mumble
functions. The compiler can be smart about eliminating unnecessary steps in
this situation.
Idea: Suppose we eliminate CONCAT altogether, and generalize the TO-LIST family
of functions to take multiple arguments. These arguments would be sequences of
any mixture of types, and their elements would be concatenated into a sequence
of the type specified in the function name. (To accommodate all the funny
types of vectors, we might also want a general TO function that takes a
required type-specifier argument before the sequences.) With a single
argument, these would do a simple coercion, as before; if there is only one
argument and it is already of the right type, it is returned without being
copied. The point is that the type of the output sequence is specified with
this scheme, and the system does not have to try to figure out "the least
general sequence type among those the implementation provides which can contain
the elements of all the argument sequences", whatever that means. Opinions?
[rpg: This overloads TO-mumble too much, providing a functionality without
a corresponding mnemonic name.]
Now for the functions that require keywords. The general idea is to replace
each group of three or five related functions with a single function that would
take a couple of required arguments followed by optional keywords. These
functions would use the good names (e.g. "REMOVE" instead of "REM") and
without any keywords would default to forward direction and EQUAL test. Thus,
they upward compatible with existing functions in Maclisp. (Were it not for
this compatibility issue, EQL would have been a better default, but a change
here would cause more trouble than it is worth.)
Under this scheme there will be no need for the old EQ forms since REMQ, for
example, is just REMOVE with the :COMPARE #'EQ option, but we should probably
include the common ones for convenience and compatibility. REMQ, MEMQ, DELQ,
and ASSQ are probably worth the redundancy.
The following keywords will be used in a consistent way through all the
keyword-taking functions. Not all of the keywords are applicable to all of the
functions -- see the individual function descriptions for a list of which
keywords are applicable.
:from-end This keyword takes no arguments. It turns any function into
the corresponding "from-end" version. If it is not present, of
course, the direction defaults to forward.
The following four keywords are mutually exclusive:
:compare predicate
The predicate is the two-argument comparison function that is
to be used in testing items against members of the sequence.
The default predicate is EQUAL.
:compare-not predicate
Like :compare, but the result of the comparison predicate is
negated before it is used.
:if predicate The predicate is a one-argument function that is used to test
members of the sequence.
:if-not predicate
Like :if, but the result of the predicate is negated before it
is used.
:start index If present, the examination or processing of the sequence
begins here. Else, processing begins at index 0. If there are
two sequence arguments and the :start keyword is used, it
provides a starting index for both sequences.
:end index If present, the examination or processing of the sequence ends
at this index (not inclusive). Else, processing proceeds to
the end of the sequence. If there are two sequence arguments
and the :end keyword is used, it provides an ending index for
both sequences.
If there are two sequence arguments and they are to be provided with different
indices, use :start1, :start2, :end1, and :end2. All take a
single index argument.
A particularly ugly case would look like this:
(position foo-element (the :vector foo-vector)
:compare-not #'eql :start 10 :end 20)
A more typical case would look like this:
(position foo-element foo-vector :compare #'eql)
The following sequence functions use the above keywords:
REMOVE item sequence &optional count <keywords>
Accepts :from-end, :compare, :compare-not, :if, :if-not.
POSITION item sequence <keywords>
Accepts :from-end, :compare, :compare-not, :if, :if-not,
:start, :end.
COUNT item sequence <keywords>
Accepts :from-end, :compare, :compare-not, :if, :if-not,
:start, :end.
MISMATCH sequence1, sequence2 <keywords>
Accepts :from-end, :compare, :compare-not, :start, :end,
:start1, :start2, :end1, end2.
MAXPREFIX sequence1, sequence2 <keywords>
Accepts :from-end, :compare, :compare-not, :start, :end,
:start1, :start2, :end1, end2.
MAXSUFFIX sequence1, sequence2 <keywords>
Accepts :from-end, :compare, :compare-not, :start, :end,
:start1, :start2, :end1, end2.
SEARCH sequence1, sequence2 <keywords>
Accepts :from-end, :compare, :compare-not, :start, :end,
:start1, :start2, :end1, end2.
In addition to the sequence functions listed above, the following list-only
functions can use the same keyword scheme to get rid of the groups of three or
five functions.
MEMBER item list <keywords>
Accepts :compare, :compare-not, :if, :if-not.
DELETE item list &optional count <keywords>
Accepts :compare, :compare-not, :if, :if-not.
ADJOIN item list <keywords>
Accepts :compare, :compare-not.
UNION &rest lists <keywords>
Accepts :compare, :compare-not.
NUNION &rest lists <keywords>
Accepts :compare, :compare-not.
INTERSECTION &rest lists <keywords>
Accepts :compare, :compare-not.
NINTERSECTION &rest lists <keywords>
Accepts :compare, :compare-not.
SET-DIFFERENCE list1 list2 <keywords>
Accepts :compare, :compare-not.
NSET-DIFFERENCE list1 list2 <keywords>
Accepts :compare, :compare-not.
SET-XOR list1 list2 <keywords>
Accepts :compare, :compare-not.
NSET-XOR list1 list2 <keywords>
Accepts :compare, :compare-not.
ASSOC item alist <keywords>
Accepts :compare, :compare-not, :if, :if-not.
RASSOC item alist <keywords>
Accepts :compare, :compare-not, :if, :if-not.
MEMASSOC item alist <keywords>
Accepts :compare, :compare-not, :if, :if-not.
MEMRASSOC item alist <keywords>
Accepts :compare, :compare-not, :if, :if-not.
POSASSOC item alist <keywords>
Accepts :compare, :compare-not, :if, :if-not.
POSRASSOC item alist <keywords>
Accepts :compare, :compare-not, :if, :if-not.
DELASSOC item alist &optional count <keywords>
Accepts :compare, :compare-not, :if, :if-not.
DELRASSOC item alist &optional count <keywords>
Accepts :compare, :compare-not, :if, :if-not.
Idea: Add the keyword :eq, which is short for :compare #'eq. This could be
used in any form that accepts the :compare keyword. Similarly, add :neq, :eql,
:neql, :equal, :nequal, :equalp, and :nequalp. This would eliminate unpleasant
verbosity in the vast majority of cases, though the added keywords might slow
things down a bit in interpreted code. Opinions?
[rpg: This argument adds weight to the functional position macro idea, which is
to put all the keywords in a macro in the lambda position, which then expands
to some awful internal name that can that displacedly stick around.]
One awkward issue is raised by the use of :if and :if-not. Since these are
one-place predicates, the item argument is not used. It would beautify the
code to eliminate the item argument if the :if keyword is present. Thus, we
would have
(member some-list :if #'ugly-element-p)
instead of
(member ignored-arg some-list :if #'ugly-element-p)
Unfortunately, it is mechanically very awkward to squeeze out a
normally-required argument that precedes other required args, and the
documentation might prove quite confusing to the user. The alternatives would
seem to be keeping the ignored argument (the user would put an explicit NIL
there) or flushing the :if construct altogether. I lean toward the latter.
Comments? Ideas?
[rpg: The macro idea fixes this too, I think. The functional macro idea is
also nice because the keywords describe the function to use in the
function call more than the function call.
Having te keyword more closely associated with the object they modify seems
like a good thing. This proposal mixes up arguments to a function with
a description of the function. The above example would be:
((member :if #'ugly-element-p) some-list)]
Random thought on NEWSEQ:
. . .
Second, I have come around to the view that the type-specific forms of these
functions should not be part of the language as it is seen by the user; only
the generic forms should be supplied. If the user wants the greatest possible
execution speed or better error checking, he can declare the types of the
sequence arguments, either by storing these arguments in variables declared to
be of a specific type or by the use of the new THE construct (whatever name we
end up giving it). The compiler is then free make whatever use it can of this
additional information in order to generate better code. Thus, one would
normally write
(length foo)
where FOO can be any type of sequence, but could also use the form
(length (the :string foo))
for faster execution when the code is compiled.
[rpg: But what about interpreted code? Some people need to run interpreted
code to debug, so do you propose that the interpreter understand THE? This
won't buy any speed. I think we want the type specific ones just for this
reason. Remember, *you* don't have the S-1.]
. . .
Idea: Suppose we eliminate CONCAT altogether, and generalize the TO-LIST family
of functions to take multiple arguments. These arguments would be sequences of
any mixture of types, and their elements would be concatenated into a sequence
of the type specified in the function name. (To accommodate all the funny
types of vectors, we might also want a general TO function that takes a
required type-specifier argument before the sequences.) With a single
argument, these would do a simple coercion, as before; if there is only one
argument and it is already of the right type, it is returned without being
copied. The point is that the type of the output sequence is specified with
this scheme, and the system does not have to try to figure out "the least
general sequence type among those the implementation provides which can contain
the elements of all the argument sequences", whatever that means. Opinions?
[rpg: This overloads TO-mumble too much, providing a functionality without
a corresponding mnemonic name.]
Idea: Add the keyword :eq, which is short for :compare #'eq. This could be
used in any form that accepts the :compare keyword. Similarly, add :neq, :eql,
:neql, :equal, :nequal, :equalp, and :nequalp. This would eliminate unpleasant
verbosity in the vast majority of cases, though the added keywords might slow
things down a bit in interpreted code. Opinions?
[rpg: This argument adds weight to the functional position macro idea, which is
to put all the keywords in a macro in the lambda position, which then expands
to some awful internal name that can that displacedly stick around.]
. . .
One awkward issue is raised by the use of :if and :if-not. Since these are
one-place predicates, the item argument is not used. It would beautify the
code to eliminate the item argument if the :if keyword is present. Thus, we
would have
(member some-list :if #'ugly-element-p)
instead of
(member ignored-arg some-list :if #'ugly-element-p)
Unfortunately, it is mechanically very awkward to squeeze out a
normally-required argument that precedes other required args, and the
documentation might prove quite confusing to the user. The alternatives would
seem to be keeping the ignored argument (the user would put an explicit NIL
there) or flushing the :if construct altogether. I lean toward the latter.
Comments? Ideas?
[rpg: The macro idea fixes this too, I think. The functional macro idea is
also nice because the keywords describe the function to use in the
function call more than the function call.
Having te keyword more closely associated with the object they modify seems
like a good thing. This proposal mixes up arguments to a function with
a description of the function. The above example would be:
((member :if #'ugly-element-p) some-list)]
-rpg-